Why simulate MRI?

MRI has a few challenges

  • MRI physics are complex and solutions often non-intuitive → teaching.
  • MRI scans are costly → slow feedback loops in pulse sequence design.
  • Sharing MRI data is complicated:
    • Ethics
    • Patient privacy issues
    • No large datasets are available (FastMRI ~9,012)

What are others doing?

MRI, like robotics, has high trial and error cost:

Can we create a digital playground to answer:

How can we get faster and more useful images?

Optimization of low-field cardiac MRI sequence1

Comparison between optimized sequence and 1.5T

Study in collaboration with Anastasia Fotaki, KCL. To be submitted.

Get quantitative information from signal dictionaries

Free-running T1 and T2 mapping at 0.55T by Diego Pedraza, UC. To be submitted.

What are we simulating?

The Bloch equations

Describe the evolution of the magnetization: \[ \frac{\mathrm{d}\vec{M}}{\mathrm{d}t} = \gamma \vec{M} \times \vec{B}(t) - \left( \frac{M_x}{T_2}, \frac{M_y}{T_2}, \frac{M_z - M_0}{T_1} \right) \]

  • \(\vec{M}\) is the measurement,
  • \(\vec{B}(t)\) is what we control with the pulse sequence,
  • \((M_0, T_1, T_2)\) are tissue properties.

This equation is not really special, it is just an ODE

\[ \Large \frac{\mathrm{d}\vec{M}}{\mathrm{d}t} = \mathrm{bloch}(t, \vec{M}) \]

… but this part makes it special!

How can we solve this for 10,000 spins?

(100x100)

The number of ODEs explodes quite rapidly

How can we solve this for 100,000,000 spins?

(460x460x460)

How are we simulating it?

Operator-splitting method

  • Continuous problem (\(t\)): \[ \begin{align*} \vec{M}_t &= ({{\color{red}A}} + {{\color{blue}B}})\vec{M}\\ \vec{m}_t &= {{\color{red}A}} \vec{m}\\ \vec{m}_t &= {{\color{blue}B}} \vec{m}\\ \end{align*} \]
  • Discrete problem (\(t_n \rightarrow t_{n+1}\)): \[ \begin{align*} \vec{m}^{n+1} &= \mathrm{e}^{{\mathrm{\color{red}A}} \Delta t} \vec{m}^{n}\\ \vec{m}^{n+1} &= \mathrm{e}^{{\mathrm{\color{blue}B}} \Delta t} \vec{m}^{n}\\ \vec{M}^{n+1} &= \mathrm{e}^{{\mathrm{\color{red}A}} \Delta t} \mathrm{e}^{\vec{\mathrm{\color{blue}B}} \Delta t} \vec{M}^{n} \end{align*} \]

KomaMRI.jl2

  • Pulseq3 as sequence definition.
  • MRD4 (previously ISMRMRD) as raw data format.

The Julia programming language5

  • Similar syntax to MATLAB (commonly used in MRI community)
  • Multi-threading support
  • Excellent GPU support6,7
  • GUI support (PlotlyJS.jl and Blink.jl)

Graphical User Interface (GUI)

Results can be exported to .mat.

GPU parallelization

Compatible with Pulseq3

seq = read_seq("spiral_3T.seq")
plot_seq(seq)
[ Info: Loading sequence spiral.seq ...
# Format of blocks:
# NUM DUR RF  GX  GY  GZ  ADC  EXT
[BLOCKS]
1   1621  1  0  0  1  0  0
2    319  2  0  0  2  0  0
3   2245  0  4  5  3  1  0
4    104  0  7  8  6  0  0

# Format of RF events:
# id amplitude mag_id phase_id time_shape_id delay freq phase
# ..        Hz   ....     ....          ....    us   Hz   rad
[RF]
1      129.712 1 2 0 100 -424.504 0
2      329.152 3 4 0 100 0 0

# Format of arbitrary gradients:
#   time_shape_id of 0 means default timing (stepping with grad_raster starting at 1/2 of grad_raster)
# id amplitude amp_shape_id time_shape_id delay
# ..      Hz/m       ..         ..          us
[GRADIENTS]
4      -773910 5 0 790
5       771376 6 0 790
7      -773910 7 8 0
8        32610 7 8 0

# Format of trapezoid gradients:
# id amplitude rise flat fall delay
# ..      Hz/m   us   us   us    us
[TRAP]
 1  1.27714e+06 250 7580 250 8130
 2       444444  90 3000  90  10
 3  -1.2716e+06 250  290 250   0
 6  1.26582e+06 250  540 250   0

# Format of ADC events:
# id num dwell delay freq phase
# ..  ..    ns    us   Hz   rad
[ADC]
1 12000 1700 790 0 0

# Sequence Shapes
[SHAPES]

shape_id 1
num_samples 8000
0.00011671692
0.000117246598
0.000117778547
0.000118312775
...
0.0422751249

shape_id 7
num_samples 2
1
0

shape_id 8
num_samples 2
0
104


[SIGNATURE]
# This is the hash of the Pulseq file, calculated right before the [SIGNATURE] section was added
# It can be reproduced/verified with md5sum if the file trimmed to the position right above [SIGNATURE]
# The new line character preceding [SIGNATURE] BELONGS to the signature (and needs to be sripped away for recalculating/verification)
Type md5
Hash efc5eb7dbaa82aba627a31ff689c8649

Koma achieved excellent accuracy for all tested cases

Koma was significantly faster than MRiLab

CPU GPU1 GPU2
Name Intel i7-1165G7 GTX 1650 Ti RTX 2080 Ti
JEMRIS \(\approx7\,\mathrm{min}\) - -
MRiLab \(1.56\,\mathrm{s} \pm 0.07\,\mathrm{s}\) \(0.84\,\mathrm{s} \pm 0.02\,\mathrm{s}\) \(0.91\,\mathrm{s} \pm 0.02\,\mathrm{s}\)
Koma \(1.82\,\mathrm{s} \pm 0.17\,\mathrm{s}\) \(0.32\,\mathrm{s} \pm 0.02\,\mathrm{s}\) \(0.15\,\mathrm{s} \pm 0.01\,\mathrm{s}\)

What’s new in KomaMRI.jl?

Package separation (Boris Oróstica)

Sub-packages have their own CI

  • KomaMRIBase: Custom types and functions
  • KomaMRICore: Simulation functions
  • KomaMRIFiles: File I/O functions
  • KomaMRIPlots: Plotting functions
  • KomaMRI: User Interface

Enhanced GPU support (Ryan Kierulf)

using KomaMRI
obj = brain_phantom2D()
seq = PulseDesigner.EPI_example()
sys = Scanner()
raw = simulate(obj, seq, sys)
using KomaMRI, CUDA # Loads KomaMRICoreCUDAExt
obj = brain_phantom2D()
seq = PulseDesigner.EPI_example()
sys = Scanner()
raw = simulate(obj, seq, sys)
using KomaMRI, AMDGPU # Loads KomaMRICoreAMDGPUExt
obj = brain_phantom2D()
seq = PulseDesigner.EPI_example()
sys = Scanner()
raw = simulate(obj, seq, sys)
using KomaMRI, Metal # Loads KomaMRICoreMetalExt
obj = brain_phantom2D()
seq = PulseDesigner.EPI_example()
sys = Scanner()
raw = simulate(obj, seq, sys)
using KomaMRI, oneAPI # Loads KomaMRICoreoneAPIExt
obj = brain_phantom2D()
seq = PulseDesigner.EPI_example()
sys = Scanner()
raw = simulate(obj, seq, sys)

Buildkite CI and benchmark setup

Kernel-based GPU programming

We achieved performance improvements and memory reductions.

Distributed computing

Koma is compatible with HPC and SLURM pipelines:

using Distributed, CUDA
addprocs(length(devices())) # One process per GPU
@everywhere begin
    using KomaMRI, CUDA
    sys = Scanner()
    seq = PulseDesigner.EPI_example()
    obj = brain_phantom2D()
    parts = kfoldperm(length(obj), nworkers()) # [1:10, 11:20, 21:30]
end
# Simulation
raw = @distributed (+) for i=1:nworkers()
    KomaMRICore.set_device!(i-1) #Sets device for this worker
    simulate(obj[parts[i]], seq, sys)
end
#!/bin/bash
#SBATCH --job-name KomaDistributed                 # Job name
#SBATCH -t 0-00:30                                 # Max runtime for job
#SBATCH -p batch                                   # Enter partition on which to run the job
#SBATCH --ntasks=1                                 # 1 task
#SBATCH --cpus-per-task=1                          # Request 1 CPU
#SBATCH --gpus=4                                   # Request 4 GPUs
#SBATCH -o /mnt/workspace/%u/slurm-out/%test.out   # Enter file path to write stdout to
#SBATCH -e /mnt/workspace/%u/slurm-out/%test.err   # Enter file path to write stderr to

module load julia/1.10.2
julia script.jl

New motion models (Pablo Villacorta)

Phantoms with composable Motion’s

obj.motion = MotionList(
  Translate(# Translation 1
    1.0e-2, # x-displacement
    0.0,    # y-displacement
    0.0,    # z-displacement
    TimeRange(0.0, 1.0), 
    AllSpins()
  ),
  Translate(# Translation 2
    0.0,    # x-displacement
    1.0e-2, # y-displacement
    0.0,    # z-displacement
    TimeRange(1.0, 2.0), 
    AllSpins()
  ),
)

Uniformly-sampled Path’s

using Random
rng = MersenneTwister(1234)
dx = cumsum(randn(rng, Nspins, Nt - 1); dims=2)
dy = cumsum(randn(rng, Nspins, Nt - 1); dims=2)
dz = cumsum(randn(rng, Nspins, Nt - 1); dims=2)
random_walk = Path(
  dx, 
  dy, 
  dz, 
  TimeRange(0.0, T)
)
obj.motion = MotionList(
  random_walk
)

FlowPath can use CFD-generated particle trajectories

  • 5 million spins
  • GPU NVIDIA Quadro RTX 4000 8 GB

Faster simulations than CMRsim8

KomaMRI 9 min vs CMRsim 39 min:

Distributed flow simulations across multiple GPUs

  • 7 million spins
  • 4 x NVIDIA RTX A5000 24 GB

So you can do a lot more stuff!

What’s next?

Parallel imaging (Martín Stockle)

We have a PR that implements it

Enzyme AD (Kareem Fareed)

There is a long way to go …

Conclusions

KomaMRI.jl has a growing community

  • Statistics:

  • 17 Collaborators:

Propaganda

Show support by starring KomaMRI.jl on GitHub ⭐!

Join JuliaHealth: #health-and-medicine on the Julia Slack.

Acknowledgements

  • Pablo Irarrázaval (UC)
  • René Botnar (KCL/UC)
  • Claudia Prieto (KCL/UC)
  • Jacob Zelko (JuliaHealth)
  • Ryan Kierulf (GSoC)
  • Boris Orostica (iHEALTH, UC)
  • Martín Stockle (iHEALTH, UC)
  • Kareem Fareed (REU, Stanford)
  • Dilum Aluthge (Julia)
  • Tim Besard (Julia)
  • and many more!

  1. American Physical Society DCOMP Travel Award
  2. Google Summer of Code 2024
  3. NIH R01 HL131823
  4. NIH R01 HL152256

References

1.
Castillo-Passi C, Kunze KP, Crabb MG, et al. Highly efficient image navigator based 3D whole-heart cardiac MRA at 0.55T. Magnetic Resonance in Medicine. 2025;93(2):689-698. doi:10.1002/mrm.30316
2.
Castillo-Passi C, Coronado R, Varela-Mattatall G, Alberola-López C, Botnar R, Irarrazaval P. KomaMRI.jl: An open-source framework for general MRI simulations with GPU acceleration. Magnetic Resonance in Medicine. 2023;90(1):329-342. doi:10.1002/mrm.29635
3.
Layton KJ, Kroboth S, Jia F, et al. Pulseq: A rapid and hardware-independent pulse sequence prototyping framework. Magnetic Resonance in Medicine. 2017;77(4):1544-1552. doi:10.1002/mrm.26235
4.
Inati SJ, Naegele JD, Zwart NR, et al. ISMRM Raw data format: A proposed standard for MRI raw datasets. Magnetic Resonance in Medicine. 2017;77(1):411-421. doi:10.1002/mrm.26089
5.
Bezanson J, Edelman A, Karpinski S, Shah VB. Julia: A Fresh Approach to Numerical Computing. SIAM Review. 2017;59(1):65-98. doi:10.1137/141000671
6.
Besard T, Churavy V, Edelman A, Sutter BD. Rapid software prototyping for heterogeneous and distributed platforms. Advances in Engineering Software. 2019;132:29-46. doi:10.1016/j.advengsoft.2019.02.002
7.
Besard T, Foket C, De Sutter B. Effective Extensible Programming: Unleashing Julia on GPUs. IEEE Transactions on Parallel and Distributed Systems. 2019;30(4):827-841. doi:10.1109/TPDS.2018.2872064
8.
Weine J, McGrath C, Dirix P, Buoso S, Kozerke S. CMRsimA python package for cardiovascular MR simulations incorporating complex motion and flow. Magnetic Resonance in Medicine. 2024;91(6):2621-2637. doi:10.1002/mrm.30010